System and process for developing a voice application

ABSTRACT

A system for use in developing a voice application, including a dialog element selector for defining execution paths of the application by selecting dialog elements and adding the dialog elements to a tree structure, each path through the tree structure representing one of the execution paths, a dialog element generator for generating the dialog elements on the basis of predetermined templates and properties of the dialog elements, the properties received from a user of the system, each of said dialog elements corresponding to at least one voice language template, and a code generator for generating at least one voice language module for the application on the basis of said at least one voice language template and said properties. The voice language templates include VoiceXML elements, and the dialog elements can be regenerated from the voice language module. The voice language module can be used to provide the voice application for an IVR.

FIELD OF THE INVENTION

The present invention relates to a system and process for generating a voice application.

BACKGROUND

A voice application is a software application that provides an interactive audio interface, particularly a speech interface, on a machine, such as an Interactive Voice Response (IVR) system. IVRs, such as Intel's Dialogic™ IVR, are used in communications networks to receive voice calls from parties. The IVR is able to generate and send voice prompts to a party and receive and interpret the party's responses made in reply.

Voice extensible markup language, or VoiceXML, is a markup language for voice or speech-driven applications. VoiceXML is used for developing speech-based telephony applications, and also enables web-based content to be accessed via voice using a telephone. VoiceXML is being developed by the VoiceXML Forum, and is described at http://www.voicexml.org. Due to the verbose nature of VoiceXML, it can be cumbersome to develop VoiceXML-based applications manually using a text or XML editor. Consequently, voice application development systems are available that allow voice applications to be developed by manipulating graphical elements via a graphical user interface rather than coding VoiceXML directly. However, these systems are limited in their ability to assist a developer. It is desired to provide a process and system for developing a voice application that improves upon the prior art, or at least provide a useful alternative to existing voice application development systems and processes.

SUMMARY OF THE INVENTION

In accordance with the present invention, there is provided a process for developing a voice application, including:

-   generating graphical user interface components for defining execution paths of said application by arranging dialog elements in a tree structure, each path through said tree structure representing one of said execution paths;
-   generating said dialog elements on the basis of predetermined templates and properties of said dialog elements, said properties received from a user via said graphical user interface components, each of said dialog elements corresponding to at least one voice language template; and
-   generating at least one voice language module for said application on the basis of said at least one voice language template and said properties.

The present invention also provides a system for use in developing a voice application, including:

-   a dialog element selector for defining execution paths of said application by selecting dialog elements and adding said dialog elements to a tree structure, each path through said tree structure representing one of said execution paths;
-   a dialog element generator for generating said dialog elements on the basis of predetermined templates and properties of said dialog elements, said properties received from a user of said system, each of said dialog elements corresponding to at least one voice language template; and
-   a code generator for generating at least one voice language module for said application on the basis of said at least one voice language template and said properties.

The present invention also provides a graphical user interface for use in developing a voice application, said interface including graphical user interface components for defining execution paths of said application by arranging configurable dialog elements in a tree structure, each path through said tree structure representing one of said execution paths, and said dialog element components may include one or more of:

-   a start dialog component for defining the start of said application;
-   a variables component for use in defining variables for said application;
-   a menu component for defining a menu;
-   a menu choice component for defining a choice of said menu;
-   a decision component for defining a decision branching point;
-   a decision branch component for defining a test condition and an execution branch of said decision branching point;
-   a form component for defining a form to collect input from a caller;
-   a record component for recording audio;
-   a speaker component for playing prompts;
-   a local processing component for defining local processing;
-   a remote processing component for performing processing on a remote system;
-   a loop component for defining an execution loop;
-   a loop call component for calling said loop;
-   a loop next component for proceeding to the next cycle of said loop;
-   a loop break component for breaking out of said loop;
-   a subroutine component for defining a subroutine;
-   a subroutine call component for calling said subroutine;
-   a subroutine return component for returning from said subroutine;
-   a jump component for defining a non-sequential execution path to a dialog element;
-   a transfer component representing the transfer of a call to another number;
-   a hotwords component for defining a word or phrase and a non-sequential execution path to a dialog element to be followed upon receipt of said word or phrase; and
-   an end component for defining an end of said application.

BRIEF DESCRIPTION OF THE DRAWINGS

Preferred embodiments of the present invention are hereinafter described, by way of example only, with reference to the accompanying drawings, wherein:

FIG. 1 is a block diagram showing a preferred embodiment of a voice application development system connected to an IVR via a network, and a telephone connected to the IVR via the PSTN;

FIG. 2 is a schematic diagram of the voice application development system, showing how a voice application is developed;

FIG. 3 is a flow diagram of a voice application development process executed by the system;

FIG. 4 is a screenshot of a graphical user interface of the voice application development system;

FIG. 5 is a screenshot of a dialog element selection bar of the graphical user interface; and

FIG. 6 is a flow diagram of a code generation process executed by the system.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

As shown in FIG. 1, a voice application development system 100 can be connected to a VoiceXML-enabled interactive voice response system (IVR) 102 via a communications network 104. The system 100 executes a voice application development process which allows an application developer to develop a speech based application using a graphical user interface of the system 100. The application can then be transferred to the IVR 102 via the network 104. A standard telephone 106 can be used to access the IVR 102 via the public switched telephone network (PSTN) 108, allowing a user of the telephone 106 to interact with the speech based application simply by speaking into the telephone 106 to provide speech input to the application in response to voice prompts provided by the IVR 102 from the application. In the described embodiment, the voice application development system 100 is a standard computer system, such as an Intel™-based personal computer running a Microsoft Windows™ operating system, and the voice application development process is implemented by software modules stored on hard disk storage of the voice application development system 100. However, it will be apparent to those skilled in the art that at least parts of the voice application development process can be alternatively implemented by dedicated hardware components such as application-specific integrated circuits (ASICs). The voice application runs on the IVR 102, which may be an Intel Dialogic™ IVR with Nuance's Voice Web Server™ software. The network 104 may be any secure communications network that enables voice applications to be loaded onto the IVR 102, such as an Ethernet LAN or TCP/IP network.

As shown in FIG. 2, the voice application development system 100 includes a dialog editor module 202, a dialog transformer module 204, an application builder module 206, an IVR code generator 208, other application development modules 210, VoiceXML templates 212, and ECMAscript templates 230. The voice application development system 100 constitutes an integrated development environment (IDE) for the development of speech based applications. The system 100 executes an application development process, as shown in FIG. 3, that allows a user of the system 100 to develop a voice application for a particular IVR platform.

When the process begins, the system 100 generates a graphical user interface, as shown in FIG. 4. The interface is in the form of a window 400 with a project pane 402, a tools pane 404, and a messages pane 406. The window 400 also includes a main menubar 408 and a main toolbar 410. The main menubar 408 includes a Tools menu that provides access to a number of modules of the system 100 that are used to develop voice applications, as described below, and the tools pane 404 provides an interface to each tool when that tool is executed.

To develop a speech based application, a user of the system 100 can create a new project or open a saved project by selecting a corresponding menu item from the “File” menu of the main menubar 408. The dialog editor 202 is then executed, and a tabbed dialog panel 411 is added to the tools pane 404, providing an interface to the dialog editor 202, and allowing the user to define an execution flow, referred to as a dialog, for the application. The dialog panel 411 includes a dialog pane 412, a dialog element toolbar 414 referred to as the dialog palette, a dialog element properties pane 416, and a dialog element help pane 418.

An application can be built from a set of seventeen dialog elements represented by icons in the dialog palette 414, as shown in FIGS. 4 and 5. Each element represents a complete or partial component of the voice application, such as a menu, a menu choice, a form, an execution loop, a speech prompt, and so on. The full set of dialog components is given in Appendix A. A dialog element is added to the dialog by selecting the element from the dialog palette 414 using a pointing device of the system 100 such as a mouse or tablet, and placing the selected dialog element on the dialog editor pane 412 using a drag-and-drop action. Each dialog element has a number of properties that can be set by the user. Once placed in the dialog editor pane 412, an instance of the selected dialog element is added to the dialog and its properties can be set. When a dialog element instance in the dialog editor pane 412 is selected, its property names and associated values are displayed in the properties pane 416. The properties pane displays the name of each property, and includes controls such as check boxes and buttons to allow the user to modify the values of existing properties, and to add or delete properties. The dialog element help pane 418 displays help information for the selected element, facilitating the rapid development of the application.

The execution flow of the application is defined by adding dialog elements to the dialog editor pane 412, setting the properties of the dialog elements, and defining the execution order of the dialog elements. The latter is achieved by dragging a dialog element and dropping it on top of an existing dialog element in the dialog editor pane 412. The dropped element becomes the next element to be executed after the element that it was dropped onto. The sequence and properties of dialog elements on the dialog editor pane 412 define a dialog. Thus a dialog represents the execution flow of a voice application as a sequence of dialog elements. This sequence represents the main flow of the application and provides a higher-level logical view of the application that is not readily evident from the application's VoiceXML code. Thus the dialog provides a clear and logical view of the execution of the application. In addition to the main flow, non-sequential execution branches can be created by using a Jump dialog element. However, such non-sequential execution is not represented in a dialog. A subroutine is represented by an icon in the project pane 402 and appears as an icon in the dialog editor pane 412 when the main dialog is displayed. The execution flow of a subroutine can be displayed by selecting its icon in the project pane 402.

The sequencing of a dialog is facilitated by enforcing strict rules on dialog elements and by including explicit links in the dialog code to transition from one dialog element to the next. In contrast to arbitrary VoiceXML code whose execution can be completely non-sequential due to the presence of “GOTO” tags, a dialog generated by the system 100 has a tree structure, with each path through the tree representing a possible path of dialog execution. This allows the dialog flow to be readily determined and displayed using high level graphical dialog elements, which would be much more difficult with arbitrary VoiceXML.
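The relationship between the tree structure and the execution paths can be illustrated with a short sketch. The Python below is not part of the described system; the DialogElement class, its fields, and the example element names are assumptions chosen for illustration. It enumerates every root-to-leaf path of a small dialog tree, each of which corresponds to one possible path of dialog execution.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class DialogElement:
    """Hypothetical node type: one dialog element in the dialog tree."""
    kind: str
    name: str
    children: List["DialogElement"] = field(default_factory=list)


def execution_paths(node: DialogElement) -> Iterator[List[str]]:
    """Yield every root-to-leaf path as a list of element names."""
    if not node.children:
        yield [node.name]
        return
    for child in node.children:
        for tail in execution_paths(child):
            yield [node.name] + tail


# A small illustrative dialog: a start element, a menu, and two choices.
dialog = DialogElement("Start", "start", [
    DialogElement("Menu", "main_menu", [
        DialogElement("MenuChoice", "booking",
                      [DialogElement("End", "end_booking")]),
        DialogElement("MenuChoice", "membership",
                      [DialogElement("End", "end_membership")]),
    ]),
])

paths = list(execution_paths(dialog))
```

Because the structure is a tree rather than an arbitrary graph, the enumeration always terminates and each element appears at a single, well-defined position in the flow.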

An application can be saved at any time by selecting a “Save” menu item of the “File” menu of the menubar 408. When an application is saved, the application dialog is translated into an extended VoiceXML format by the dialog transformer 204. Each dialog element in the dialog flow is first translated into corresponding VoiceXML code. Each of the seventeen dialog elements corresponds to one of the seventeen VoiceXML templates 212 that performs the functionality of that element. A VoiceXML template is a sequence of VoiceXML elements that produces the behaviour that the dialog element represents. It is a template because it needs to be configured by the element properties (e.g., name, test condition) which are set by the user, as described above.

Some dialog elements correspond to similar VoiceXML elements (e.g., a Menu dialog element corresponds to a VoiceXML <menu> element), while others map onto a complex sequence of VoiceXML elements (e.g., a Loop dialog element corresponds to multiple VoiceXML <form> elements, each form specifying the next form to execute in an iterative loop). However, even dialog elements that correspond to similar VoiceXML elements represent more functionality than the equivalent VoiceXML element. For example, a Menu dialog element allows prompts to be set by the user, and the Menu dialog element actually maps onto a block of VoiceXML code that contains a <menu> element with embedded <prompt>, <audio>, and other XML elements.

Each dialog element's VoiceXML template is separate from the next and can be sequenced to produce the dialog flow. The sequencing is achieved by a reference at the bottom of each element's template to the next element's template, which causes the templates to be executed in the desired order.

The translation from high-level dialog elements into VoiceXML proceeds as follows. The dialog elements are stored in a tree structure, each branch of the tree corresponding to a path in the dialog flow. The tree is traversed in pre-order traversal to convert each element visited into VoiceXML. For each visited dialog element, VoiceXML code is generated from its corresponding VoiceXML template by filling in the missing or configurable parts of the template using the element properties set by the user, and adding a link to the next element's VoiceXML code at the bottom of the current element's generated VoiceXML code.
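The traversal and template-filling steps described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the implementation of the dialog transformer 204: the Node class, the two toy templates, and the rule that the "next" element is the first child are all hypothetical, and the form identifier convention (element kind in upper case, an underscore, then the element name) is inferred from identifiers such as SUBROUTINECALL_getMembership that appear in the examples of this specification.

```python
from string import Template

# Hypothetical templates keyed by element kind. The actual VoiceXML
# templates 212 are not reproduced in this sketch.
TEMPLATES = {
    "Speaker": Template(
        '<form id="$id"><block><prompt>$text</prompt>$link</block></form>'),
    "End": Template('<form id="$id"><block><exit/></block></form>'),
}


class Node:
    """Hypothetical dialog element: a kind, user-set properties, children."""

    def __init__(self, kind, props, children=()):
        self.kind = kind
        self.props = props
        self.children = list(children)

    @property
    def form_id(self):
        # Assumed naming convention: KIND_name.
        return f"{self.kind.upper()}_{self.props['name']}"


def generate(node, out=None):
    """Pre-order traversal: emit the node's code, then visit its children."""
    if out is None:
        out = []
    # Link to the next element (the first child in this simplified model).
    link = ""
    if node.children:
        link = f'<goto next="#{node.children[0].form_id}"/>'
    values = dict(node.props, id=node.form_id, link=link)
    out.append(TEMPLATES[node.kind].substitute(values))
    for child in node.children:
        generate(child, out)
    return out


tree = Node("Speaker",
            {"name": "welcome", "text": "welcome to the booking line"},
            [Node("End", {"name": "done"})])
forms = generate(tree)
```

Using substitute rather than safe_substitute means that a property required by a template but not set on the element raises an error instead of silently leaving a placeholder in the generated code.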

Although the forward transformation from dialog flow to VoiceXML is relatively straightforward, the reverse transformation from VoiceXML to dialog flow is more difficult. The sequencing of dialog elements can be recreated from the generated VoiceXML, but property settings for the elements may not be available because some information in the dialog elements is lost when they are converted to VoiceXML. This lost information may not fall within the scope of VoiceXML, and hence, cannot be naturally saved in VoiceXML code. For example, type information for a Form element is used to generate the grammar for that Form. However, the VoiceXML code simply needs to reference the generated Grammar File and is not concerned with the type information itself. Thus, the mapping of the Form element to equivalent VoiceXML code does not include the type information.

To facilitate the reverse translation from VoiceXML code to dialog, the dialog transformer 204 modifies the VoiceXML code by inserting additional attributes into various element tags, providing dialog element information that cannot be stored using the available VoiceXML tags. The resulting file 214 is effectively in an extended VoiceXML format. The additional attributes are stored in a separate, qualified XML namespace so that they do not interfere with the standard VoiceXML elements and attributes, as described in the World Wide Web Consortium's (W3C) Namespaces in XML recommendation, available at http://www.w3.org/TR/1999/REC-xml-names-19990114/. This facilitates the parsing of extended VoiceXML files.

Specifically, an extended VoiceXML file can include the following namespace declaration:

-   -   <vxml version="1.0" xmlns:lq="http://www.telstra.com.au/LyreQuest">

This defines a namespace prefix “lq” as bound to the uniform resource identifier (URI) http://www.telstra.com.au/LyreQuest. Subsequently, the file may contain the following extended VoiceXML:

    <form id="SUBROUTINECALL_getMembership"
      lq:element="SubroutineCall"
      lq:name="getMembership"
      lq:calls="sub1.vxml#getMembershipsub">
     <subdialog name="subcall"
       src="sub1.vxml#SUBROUTINE_getMembershipsub">
      <filled>
       <assign name="getMembership.enrich_membership"
        expr="subcall.enrich_membership" lq:element="Output"/>
      </filled>
     </subdialog>
     <block name="link">
      <goto next="#REMOTE_book_flight"/>
     </block>
    </form>

where the indicated XML tag attributes provide the additional dialog element information, and the remaining code is standard VoiceXML. The additional or extended attributes include the lq namespace prefix. The lq:element, lq:name, and lq:calls attributes indicate, respectively, the dialog element that the VoiceXML corresponds to, the name given to that element by the user, and the package and name of the Subroutine element that is being called by the SubroutineCall element. Other elements will have different extended attributes.

The equivalent code in VoiceXML omits the extended attributes, but is otherwise identical:

    <form id="SUBROUTINECALL_getMembership">
     <subdialog name="subcall"
       src="sub1.vxml#SUBROUTINE_getMembershipsub">
      <filled>
       <assign name="getMembership.enrich_membership"
        expr="subcall.enrich_membership"/>
      </filled>
     </subdialog>
     <block name="link">
      <goto next="#REMOTE_book_flight"/>
     </block>
    </form>
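Because the extended attributes live in their own XML namespace, removing them to recover standard VoiceXML is mechanical. The following Python sketch shows one way this could be done with the standard library's xml.etree.ElementTree; the function name strip_extended_attributes is hypothetical and the input is a shortened version of the example above, so this is an illustration rather than the code used by the system 100.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the "lq" prefix in the extended VoiceXML file.
LQ_NS = "http://www.telstra.com.au/LyreQuest"


def strip_extended_attributes(xml_text: str) -> str:
    """Return the document with every lq-namespaced attribute removed."""
    root = ET.fromstring(xml_text)
    prefix = "{" + LQ_NS + "}"  # ElementTree stores names as {uri}local
    for elem in root.iter():
        for key in list(elem.attrib):
            if key.startswith(prefix):
                del elem.attrib[key]
    return ET.tostring(root, encoding="unicode")


extended = (
    '<vxml version="1.0" xmlns:lq="http://www.telstra.com.au/LyreQuest">'
    '<form id="SUBROUTINECALL_getMembership" lq:element="SubroutineCall"'
    ' lq:name="getMembership">'
    '<block name="link"><goto next="#REMOTE_book_flight"/></block>'
    '</form></vxml>'
)
pure = strip_extended_attributes(extended)
```

The reverse direction relies on the same property: a parser that understands the namespace can read lq:element and lq:name to rebuild each dialog element, while a standard VoiceXML interpreter can ignore them.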

Two extended VoiceXML files, including all the available extended attributes, are listed in Appendix B.

When the application is saved, the dialog transformer 204 also generates a number of other files, including a project file 216, package description files 218, and type description files 220. The project file is given a filename extension of “.lqx”, and contains information about the packages (i.e., self-contained groups of files) and other data files making up a project of the voice application development system 100.

An example project file is listed below. Within the project file, the project is defined by a “project” XML element that defines the project name as “mas”. Within the “project” element are four sequential “folder” elements that define subdirectories or folders of the directory containing the project file, respectively named Packages, Transcripts, Scenarios, and Generated Code. These folders contain respectively the project's packages, transcripts of text, scenarios of interaction between the corresponding application and a user, and VoiceXML code and grammar generated for one or more specific IVR platforms. Within the “Packages” folder element is a “package” element giving the location and name of any packages used by the project. The “folder” elements can contain one or more “file” elements, each defining the type and name of a file within the encapsulating folder. The “folder” elements can be nested.

    <?xml version="1.0"?>
    <project name="mas">
     <folder name="Packages" directory="packages">
      <package directory="mas" file="mas.pkg.xml"/>
     </folder>
     <folder name="Transcripts" directory="transcripts">
      <file type="transcript" name="mas.in"/>
      <file type="negative" name="mas.negative"/>
     </folder>
     <folder name="Scenarios" directory="scenarios">
      <file type="scenario" name="mas.scen"/>
      <file type="states" name="mas.states"/>
     </folder>
     <folder name="Generated Code" directory="deploy">
      <folder name="JSGF Code" directory="jsgf">
      </folder>
      <folder name="Nuance Code" directory="nuance">
      </folder>
     </folder>
    </project>

A package description file is given a filename extension of “.pkg.xml”, and contains information about data files belonging to an individual Package of a project. An example of a package description file for the package named “mas” is given below. The file defines the project's dialog file as “mas.vxml”, four grammar files, four prompt files, and three type definition files, containing definitions of user-defined variable types. These files are described in more detail below.

    <?xml version="1.0"?>
    <package name="mas">
     <file type="dialog" name="mas.vxml"/>
     <file type="grammar">
      <file type="rulelist" name="mas.rulelist"/>
      <file type="cover" name="mas.cover"/>
      <file type="slots" name="mas.slots"/>
      <file type="targets" name="mas.targets"/>
     </file>
     <file type="prompt">
      <file type="rulelist" name="mas.prompts.rulelist"/>
      <file type="cover" name="mas.prompts.cover"/>
      <file type="slots" name="mas.prompts.slots"/>
      <file type="targets" name="mas.prompts.targets"/>
     </file>
     <file type="typedef" name="cities.type.xml"/>
     <file type="typedef" name="fare_class.type.xml"/>
     <file type="typedef" name="collection_point.type.xml"/>
    </package>

A Type description file is given a filename extension of “.type.xml”, and contains information about a user-defined Type used in a Package of a project. An example of the file is given below. The file defines an enumerated type named “fare_class” with three possible values: “first”, “business”, and “economy”. The “fare_class” type is associated with four files, respectively defining rules for the grammar, cover (a set of example phrases), slots (the parameter=value fields that the grammar can return), and targets (more specific slot filling information).

    <?xml version="1.0" encoding="utf-8"?>
    <types>
     <enum name="fare_class">
      <file type="grammar">
       <file type="rulelist" name="fare_class.rulelist"/>
       <file type="cover" name="fare_class.cover"/>
       <file type="slots" name="fare_class.slots"/>
       <file type="targets" name="fare_class.targets"/>
      </file>
      <item name="first"/>
      <item name="business"/>
      <item name="economy"/>
     </enum>
    </types>
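A type description file of this form is straightforward to read back programmatically. The Python sketch below, which is illustrative only, parses the “fare_class” example with the standard library and collects the enumerated values and the associated grammar files. The function name load_enums and the shape of the returned dictionary are assumptions, not part of the described system.

```python
import xml.etree.ElementTree as ET

# The example type description file, supplied as bytes so that the
# encoding declaration is honoured by the parser.
TYPE_FILE = b"""<?xml version="1.0" encoding="utf-8"?>
<types>
 <enum name="fare_class">
  <file type="grammar">
   <file type="rulelist" name="fare_class.rulelist"/>
   <file type="cover" name="fare_class.cover"/>
   <file type="slots" name="fare_class.slots"/>
   <file type="targets" name="fare_class.targets"/>
  </file>
  <item name="first"/>
  <item name="business"/>
  <item name="economy"/>
 </enum>
</types>"""


def load_enums(xml_bytes):
    """Map each enum name to its item names and its grammar file names."""
    root = ET.fromstring(xml_bytes)
    enums = {}
    for enum in root.findall("enum"):
        grammar = enum.find("file[@type='grammar']")
        files = {}
        if grammar is not None:
            files = {f.get("type"): f.get("name")
                     for f in grammar.findall("file")}
        enums[enum.get("name")] = {
            "items": [item.get("name") for item in enum.findall("item")],
            "files": files,
        }
    return enums


enums = load_enums(TYPE_FILE)
```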

Returning to FIG. 3, in order to deploy the application on the IVR 102, the application dialog is translated into VoiceXML by the application builder 206 at step 604. In addition to the dialog, voice applications require grammar and prompts. The application builder 206 generates grammar files 222 and prompts files 224 automatically, using information specified by the user and stored in the dialog, such as prompt wordings and Form input types. This information is supplied by the user entering typical phrases for ‘mixed initiative’ recognition (i.e., input containing multiple pieces of information). By applying generalisation methods to these phrases, a combinator module of the application development modules 210 generates a starting grammar set capable of handling a large number of input phrases. The application builder 206 also invokes the dialog transformer 204 to create the extended VoiceXML file 214. The grammar 222 and prompts 224 files are used by the IVR code generator 208 to generate VoiceXML 226 for the IVR 102.

A generated grammar file is given a filename extension of “.rulelist”. An example of a generated grammar file for a flight booking system is:

    .Ask_flight_details_destination Cities:X 2 0 destination=$X.cities
    .Ask_flight_details_departure_point Cities:X 2 0 departure_point=$X.cities
    .Ask_flight_details_ticket_class Fare_class:X 2 0 ticket_class=$X.fare_class
    .Ask_flight_details_date Date:X 2 0 date.day=$X.date.day date.year=$X.date.year date.day_of_week=$X.date.day_of_week date.month=$X.date.month date.modifier=$X.date.modifier
    .Form_flight_details booking 2 1
    .Ask_book_flight_confirmation Confirmation:X 2 0 confirmation=$X.confirmation
    .Ask_get_enrich_number_enrich_number Digitstring:X 2 0 enrich_number=$X.digitstring
    .Ask_get_collection_point_collection_point Collection_point:X 2 0 collection_point=$X.collection_point
    .Ask_another_flight_second_flight Confirmation:X 2 0 second_flight=$X.confirmation
    .Ask_get_number_get_number_for_balance Digitstring:X 2 0 get_number_for_balance=$X.digitstring
    .Menu_main_menu_booking Booking 2 0 ! !Booking booking 3 0 ! !Booking 1 0
    .Menu_main_menu_membership Membership 2 0 ! !Membership membership 3 0 ! !Membership 1 0
    .Form_flight_details GF_IWantTo:X738 book a Fare_class:X741 class ticket to Cities:X745 from Cities:X747 on Date:X749 2 1 date.day=$X749.date.day date.year=$X749.date.year ticket_class=$X741.fare_class date.day_of_week=$X749.date.day_of_week date.month=$X749.date.month departure_point=$X747.cities destination=$X745.cities date.modifier=$X749.date.modifier
    .Form_flight_details GF_IWantTo:X750 book a Fare_class:X753 class ticket to Cities:X757 2 1 ticket_class=$X753.fare_class destination=$X757.cities

The first line or rule of this grammar can be used as an example:

-   .Ask_flight_details_destination Cities:X 2 0 destination=$X.cities

This grammar rule might be invoked when a flight booking application prompts a customer to provide the destination of a flight. The first field, .Ask_flight_details_destination, provides the name of the grammar rule. The second field, Cities:X, indicates that the customer's response X is of type Cities. This type is defined by its own grammar that includes a list of available city names. The following two fields, 2 0, are used for grammar learning, as described in International Patent Publication No. WO 00/78022, A Method of Developing an Interactive System. The first field indicates the number of training examples that use the grammar rule. The second field indicates the number of other rules that refer to the rule. The last field, destination=$X.cities, indicates that the result of the rule is that the parameter destination is assigned a value of type Cities having the value of X. A more complex example is provided by the last rule:

-   .Form_flight_details GF_IWantTo:X750 book a Fare_class:X753 class ticket to Cities:X757 2 1 ticket_class=$X753.fare_class destination=$X757.cities

In this case, the grammar rule invokes three other grammars: GF_IWantTo, Fare_class, and Cities and assigns the results to parameters named X750, X753, and X757, respectively. This rule defines the application parameters ticket_class and destination.
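The field layout just described can be made concrete with a small parser. The Python below is a sketch under assumptions: it handles only rules of the shape shown in the two worked examples (a rule name, a body, two integer counts, then zero or more parameter=value assignments), and the GrammarRule class and parse_rule function are hypothetical names. It does not attempt to handle the lines containing “!” in the example file, nor rules that carry a single count.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class GrammarRule:
    name: str                          # rule name without the leading "."
    body: List[str]                    # words and Type:Variable references
    references: List[Tuple[str, str]]  # (grammar type, variable) pairs
    training_examples: int             # first count field
    referring_rules: int               # second count field
    slots: Dict[str, str]              # parameter -> value expression


def parse_rule(line: str) -> GrammarRule:
    """Split one rule line into the fields described in the text."""
    tokens = line.split()
    name = tokens[0].lstrip(".")
    rest = tokens[1:]
    # Trailing tokens of the form parameter=value are slot assignments.
    slots = {}
    while rest and "=" in rest[-1]:
        key, value = rest.pop().split("=", 1)
        slots[key] = value
    # The two integers before the assignments are the learning counts.
    referring = int(rest.pop())
    training = int(rest.pop())
    references = [tuple(tok.split(":", 1)) for tok in rest if ":" in tok]
    return GrammarRule(name, rest, references, training, referring, slots)


rule = parse_rule(
    ".Form_flight_details GF_IWantTo:X750 book a Fare_class:X753 class "
    "ticket to Cities:X757 2 1 ticket_class=$X753.fare_class "
    "destination=$X757.cities"
)
```

For the rule above, the references field lists the three invoked grammars together with the variables that receive their results, and the slots field holds the two application parameters that the rule defines.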

A prompts file is given a filename extension of “.prompts.rulelist”, and each line of the file defines the speech prompt that is to be provided to a user of the application when the corresponding element of the dialog is executed. An example of a generated prompts file is:

    .Goodbye thank you for using this application. goodbye. 1 1
    .Noinput sorry, i did not hear anything. 1 1
    .Nomatch sorry, i did not understand you. 1 1
    .Ask_flight_details_destination where would you like to go 1 1
    .Ask_flight_details_destination_2 please say the destination city name 1 1
    .Help_Ask_flight_details_destination please say the destination. 1 1
    .Ask_flight_details_departure_point where would you like to fly from 1 1
    .Ask_flight_details_departure_point_2 please say the departure point 1 1
    .Help_Ask_flight_details_departure_point please say the departure point. 1 1
    .Ask_flight_details_ticket_class what class would you like to fly 1 1

The format of the prompts file is the same as the grammar file. This allows the prompts to be improved through machine learning as though they were a grammar, using a grammar learning method such as that described in International Patent Publication No. WO 00/78022, A Method of Developing an Interactive System.

The generated prompts include dynamic prompts. An example of a dynamic prompt is: “You have selected to buy Telstra shares. How many of the Telstra shares would you like to buy?”. The word “Telstra” is dynamically inserted into the application's prompt to the user.

The voice application development system 100 generates text-to-speech (TTS) prompts within the VoiceXML code that are evaluated on the fly. Although VoiceXML syntax allows an expression to be evaluated and played as a TTS prompt, the system 100 extends this by allowing an ECMAscript or JavaScript function to be called to evaluate each variable used in a prompt. By evaluating variables in a function rather than as an inline expression, complex test conditions can be used to determine the most suitable prompt given the available information in the variables. This might result in a prompt, for example, of “six dollars” rather than “six dollars and zero cents”. In addition to automatically generating and incorporating JavaScript function calls in VoiceXML, the system 100 also generates the corresponding JavaScript functions by incorporating user-supplied prompt text and variables into the JavaScript templates 230. This allows the user to develop a voice application with dynamically generated prompts without having to manually code any JavaScript.

For example, an automatically generated function call for a prompt named PromptConfirm_payment_details is:

    <field name="confirm">
     <grammar src="mas.gsl#Confirm"/>
     <prompt>
      <value expr="PromptConfirm_payment_details(
       payment_details.company,
       payment_details.amount,
       payment_details.payment_date)"/>
     </prompt>

The corresponding JavaScript prompt function generated by the system 100 is:

    function PromptConfirm_payment_details(company, amount, payment_date)
    {
     var result;
     result = "the company is " + Bpay_names(company) +
      "the amount is " + Money(amount) +
      "the payment date is " + Date(payment_date) +
      "is this correct? ";
     if (valid_string(result))
     {
      return result;
     }
     ;
     return result;
    }
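The benefit of evaluating a variable in a function rather than inline, as in the “six dollars” example given earlier, can be illustrated with a sketch of what a helper such as Money might do. The specification does not list that helper, so the following is a hypothetical stand-in written in Python purely for illustration (the system itself generates JavaScript). It accepts whole dollar and cent amounts from 0 to 99 and omits whichever part is zero.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_words(n: int) -> str:
    """Spell out an integer from 0 to 99."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only handles 0 to 99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def money_prompt(dollars: int, cents: int) -> str:
    """Build a spoken amount, leaving out a zero dollars or zero cents part."""
    parts = []
    # Say the dollars when non-zero, or when the whole amount is zero.
    if dollars or not cents:
        unit = "dollar" if dollars == 1 else "dollars"
        parts.append(f"{number_words(dollars)} {unit}")
    if cents:
        unit = "cent" if cents == 1 else "cents"
        parts.append(f"{number_words(cents)} {unit}")
    return " and ".join(parts)
```

The conditional tests inside money_prompt are the kind of logic that would be awkward to express as a single inline expression within a VoiceXML prompt.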

The system 100 represents prompts using a language model that describes all of the prompts that can be played, along with their meanings. This model contains the same type of information as a speech recognition grammar, and therefore the prompts to be played can be represented using a grammar. Prompts to be generated by the application are first represented as a grammar to enable that grammar to be improved using techniques such as grammar learning, as described in International Patent Publication No. WO 00/78022, A Method of Developing an Interactive System. The grammar is subsequently converted into JavaScript and referenced by the application's VoiceXML tags, as described above.

An example of a prompt represented as a grammar is:

    .Confirm_payment_details the company is Bpay_names:x1 the amount is Money:x2 the payment date is Date:x3 is this correct? 1 company=$x1.company amount.dollars=$x2.amount.dollars amount.cents=$x2.amount.cents payment_date=$x3.payment_date

Returning to FIG. 3, after the application has been built at step 604, that is, the extended VoiceXML 214, the grammar 222 and the prompts 224 for the application have been generated by the application builder 206, the application can be tested and further developed at steps 606 to 610. Steps 606, 608, 610 and 612 can be executed in any order. At step 606, the application can be simulated and refined. This involves simulating the execution of the application and refining its accuracy by allowing the user to tag phrases that do not match the existing grammar. When user input during testing does not match the grammar, a dialog box is displayed, allowing the user to tag the phrase and supply the correct ‘slot’ or slots corresponding to that input. A slot is a parameter=value pair for the application, such as “fare_class=business”. A grammar learner module of the application development modules 210 then uses a transcript of the simulation to update the application grammar 222. New phrases learnt from this grammar are then displayed to the user, who can manually tag individual phrases as being incorrect. At step 608, the grammar capabilities of the application can be further improved by a Train Grammar tool of the application development modules 210. This is similar to the simulate and refine module, but allows the user to enter a list of typical user responses for each application prompt. At step 610, a Generate Scenarios module of the application development modules 210 generates scenarios of possible interactions between the application and a human. Based on these scenarios, the user can determine whether any application prompts need improvement.
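The notion of a slot as a parameter=value pair can be made concrete with a short sketch that parses a whitespace-separated list of slots into an object. The function parseSlots, and the treatment of dotted parameter names such as amount.dollars as nested properties, are assumptions made for illustration and do not describe the tagging dialog box of the system 100.

```javascript
// Hypothetical sketch: parses "parameter=value" pairs into an object.
// Dotted parameter names are assumed to denote nested properties.
// All values are kept as strings.
function parseSlots(text) {
  var slots = {};
  text.trim().split(/\s+/).forEach(function (pair) {
    var eq = pair.indexOf("=");
    if (eq <= 0) {
      return; // skip tokens that are not parameter=value pairs
    }
    var path = pair.slice(0, eq).split(".");
    var value = pair.slice(eq + 1);
    var target = slots;
    for (var i = 0; i < path.length - 1; i++) {
      if (typeof target[path[i]] !== "object" || target[path[i]] === null) {
        target[path[i]] = {};
      }
      target = target[path[i]];
    }
    target[path[path.length - 1]] = value;
  });
  return slots;
}
```

Under these assumptions, parseSlots("fare_class=business") returns an object whose fare_class property is the string "business".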

When the application has been tested and is ready for use, the IVR code generator 208 executes a code generation process at step 612 to generate pure VoiceXML suitable for a particular speech-enabled IVR such as the IVR 102 of FIG. 1. As shown in FIG. 6, the code generation process begins at step 702 by removing the extended attributes from the extended VoiceXML file 214 to generate pure VoiceXML. At step 704, prompts (including dynamic prompts) in the prompt file 224 are converted into JavaScript functions. At step 706, these JavaScript functions are incorporated into the pure VoiceXML by adding references to the functions in VoiceXML tags, and adding the functions themselves to the pure VoiceXML. At step 708, the IVR grammar file 228 is generated by translating the application grammar file 222 into a grammar format supported by the desired IVR platform, such as Nuance™ GSL or generic VoiceXML 1.0 JSGF grammar format, as selected by the user. Other grammar formats can be supported in the same way. At step 710, references to the IVR grammar file 228 are incorporated into the pure VoiceXML. The result is the pure VoiceXML file 226. The VoiceXML file 226 and the IVR grammar file 228 are sufficient to deploy the voice application on the IVR 102.
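Step 702 can be pictured with the following minimal sketch, which assumes that the extended attributes are distinguished by a namespace prefix on the attribute name. The prefix "ext" used in the usage note and the function name stripExtendedAttributes are hypothetical, and the regular-expression approach is a simplification: a production tool would more likely use an XML parser, since a pattern of this kind can also match text that merely looks like an attribute.

```javascript
// Hypothetical sketch of step 702: removes attributes whose names carry a
// given namespace prefix, leaving the remaining markup untouched.
// Assumes the prefix is a simple name with no regex metacharacters.
function stripExtendedAttributes(markup, prefix) {
  // Matches: whitespace, prefix:name, "=", then a quoted attribute value.
  var pattern = new RegExp(
    "\\s+" + prefix + ":[\\w.-]+\\s*=\\s*(?:\"[^\"]*\"|'[^']*')", "g");
  return markup.replace(pattern, "");
}
```

Under these assumptions, calling stripExtendedAttributes on the tag <field name="confirm" ext:type="Form"> with the prefix "ext" returns <field name="confirm">.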

For the purposes of illustration, Appendix C provides a partial listing of a pure VoiceXML file corresponding to the first extended VoiceXML file listed in Appendix B. The listing in Appendix C includes the VoiceXML with the merged JavaScript for supporting Prompts. The JavaScript code is at the end of the listing.

Many modifications will be apparent to those skilled in the art without departing from the scope of the present invention as herein described with reference to the accompanying drawings.

Appendix A: Dialog Elements

Decision

Description: A point in the dialog where a decision is made to determine which path of execution should continue.
Notes: Each Decision element is followed by one or more Decision-Branch elements.
Parameters:
Decision Name: a name for this Decision element.

Decision branch

Description: A path that is followed if the condition is true.
Notes: Can only be added after a Decision element.
Parameters:
Branch Condition: an ECMAScript expression that, if evaluated to true, will cause execution to continue down this branch, eg. x == 2, validity == true, currency == ‘AUD’. Note that strings should be quoted in single quotes. An “else” condition is created by enabling the Otherwise checkbox.
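The relationship between a Decision element and its Decision-Branch elements can be sketched as follows. The sketch assumes that branches are tested in the order in which they appear and that the first branch whose condition is true is followed, with an Otherwise branch acting as the “else” case; this ordering, the function name chooseBranch and the branch object shape are assumptions made for illustration.

```javascript
// Hypothetical sketch: selects the first branch whose condition holds.
// A branch flagged "otherwise" is taken unconditionally when reached.
function chooseBranch(branches, scope) {
  for (var i = 0; i < branches.length; i++) {
    var branch = branches[i];
    if (branch.otherwise || branch.condition(scope)) {
      return branch.name;
    }
  }
  return null; // no branch matched and no Otherwise branch was defined
}

// Example branches corresponding to the condition currency == 'AUD'.
var exampleBranches = [
  { name: "Australian", condition: function (s) { return s.currency == 'AUD'; } },
  { name: "Foreign", otherwise: true }
];
```

With these example branches, chooseBranch(exampleBranches, {currency: 'AUD'}) returns "Australian", and any other currency returns "Foreign".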

Menu

Description: Prompts the user to select from a series of menu choices.
Notes: Each Menu element is followed by one or more Menu-Choice elements. Prompts for the menu can include: (1) auto-generated prompts (by default); (2) user-specified prompts; (3) an audio file for playing prompts. In addition, tapered prompts are supported.
Parameters:
Menu Name: name for this Menu element.
Prompts Tapered prompts: the tapered prompts played to ask for a menu choice. The system can generate a default text prompt based on the menu choices and a default audio file name for recording the corresponding audio prompt, or you may enter your own values. If an audio prompt is not required, the “Generate default audio file name” checkbox should be unchecked and the “Audio file” text box should be left empty.
Help prompt: the prompt played when the user requests help. As with tapered prompts, the system can generate a default text prompt and audio file name, or you may enter your own.

Menu choice

Description: Defines a specific menu choice.
Notes: Can only be added after a Menu element. This is the only Element whose name can have a space.
Parameters:
Choice Choice: a word or phrase for this menu choice.

Form

Description: Collects user inputs to fill in a ‘form’. Defines a type and prompts for each input. These can be standard types such as Time and Date, or user defined such as a list of products.
Notes: User-defined types need to be first created in the LHS Dialog/Project Files window, by right-clicking on the ‘package’ icon. Selecting “Hot link” allows jumping to the Form from anywhere in the dialog. The Forms support mixed initiative filling of the Fields. Other Elements can refer to Form Fields using: “FormName.FieldName”. When entering Form Fields, default string values need to be in single quotes (eg. ‘telstra’).
Parameters:
Form Name: a name for this Form element. An object with this name will be created to store the form fields. Each field can then be accessed as $formName.$fieldName, eg. BuyForm.price.
Modal: if enabled, then only that input field's grammar is enabled when collecting each field; all other grammars are temporarily disabled (including hot-linked global grammars). This means that you cannot get out of this form by saying a hotword. For example, a form that collects a login pin should be made modal.
Confirm All: enable confirmation of all form fields together at the end.
Hot link: enable jumping to the form from anywhere in the document, by creating a document grammar for the form.
Input Field Name: name of slot to be filled.
Type: type of slot value.
Default: default value for slot. If a default is given, the user will not be prompted for the field. If the field type is a structured object, the slot will have multiple properties, one for each of the object's properties. You may specify a default by filling in some or all of these properties, depending on what is valid for a particular type. Note that strings should be quoted in single quotes, eg. ‘wednesday’.
Confirm: enable individual confirmation of this field. This is in addition to the form confirmation of all fields at the end.
Input Field Prompts Tapered prompts: the tapered prompts played to collect input for this field. The system can generate a default text prompt based on the field name and a default audio file name for recording the corresponding audio prompt, or you may enter your own values. If an audio prompt is not required, the “Generate default audio file name” checkbox should be unchecked and the “Audio file” text box should be left empty.
Help prompt: the prompt played when the user requests help. As with tapered prompts, the system can generate a default text prompt and audio file name, or you may enter your own.

Record

Description: Captures a voice recording from the user.
Notes: Note that the Record duration is in seconds. Setting it too high will result in excessive memory usage and possible hard disk limit problems.
Parameters:
Record Name: a name for this Record element.
MIME-type: MIME-type storage format for the recording. This field may be left blank if the IVR does not allow a MIME-type specification. Please consult your IVR documentation for supported MIME-types.
Beep: enable a beep to be sounded before recording starts.
DTMF Terminate: enable a DTMF key press to terminate the recording.
Duration: maximum recording time in seconds.
Confirm: enable confirmation of the recorded material.
Prompt: the prompt played to ask for a recording. The system can generate a default text prompt and a default audio file name for recording the corresponding audio prompt, or you may enter your own values. If an audio prompt is not required, the “Generate default audio file name” checkbox should be unchecked and the “Audio file” text box should be left empty.

Speaker

Description: Plays TTS and audio to the caller.
Notes: Can say variables of predefined types, such as Date, Money, etc. Each variable needs to be declared as a valid object (unless it is created via a Form Field). For example, the user can use local processing to declare a variable called price as follows: price = new Object(); price.dollars = 10; price.cents = 4; To use it in Speaker, type the following in a Speaker expression fragment: PromptMoney(price). To play back predefined Types, such as Money: (1) add a Speaker fragment of the type “Expression”; (2) enter in the “Expression” text box: PromptType(x), where Type is the predefined type name and x is the variable of the predefined type. For example, if “y” is of type Money (where y.dollars = 5 and y.cents = 0), then entering PromptMoney(y) will result in the following being played: “five dollars”.
Parameters:
Speaker Name: a name for this Speaker element.
Fragment Type: type of prompt fragment.
Text: a fixed text fragment. The text specifies text-to-speech (TTS) that should be played. The audio file URL specifies a recording that should be played, in precedence to the text-to-speech, eg. greeting.wav, http://server/welcome.wav. Either of the audio file or the text may be left empty to request that only TTS or audio with no alternate text should be played.
Expression: an evaluated prompt fragment. The ECMAScript expression is evaluated and played as text-to-speech (TTS), eg. recordedMessage, counter + 1. To play back a variable of a known type, you should use the function PromptType(x), where Type is the name of that type, and x is the name of the variable, eg. PromptMoney(BuyForm.price). This is particularly important for playing back structured types. The audio expression is evaluated and retrieved as an audio file URL for playing, eg. company + ‘.wav’. The expressions may reference any variable defined in the package, such as in a Variables element or a Form object. Either of the audio expression or the TTS expression may be left empty to request that only TTS or audio with no alternate text should be played.

Local Processing

Description: Local computation.
Notes: This can be any ECMAScript code, eg. functions, statements, function calls. Any variable declared in the dialog may be accessed, eg. declared Variables, Form fields, SubroutineCall outputs, RemoteProcessing outputs.
Parameters:
Local Processing Name: a name for this Local Processing element.
ECMAScript: any arbitrary ECMAScript code, such as variable declarations, if blocks, assignment statements, function declarations and function calls, eg. x = x + 1; currency = ‘AUD’;. Note that strings should be quoted in single quotes.

Remote Processing

Description: Call a server-side script (via an HTTP URL) to perform some processing. The script should return a VoiceXML document with the results of the processing.
Notes: PHP/CGI needs to be running on the remote Web server to handle this. The names of the input and output parameters needn't be declared in the Variables element. The names of the input and output parameters should match what is required at the remote server end. Other Dialog Elements can refer to the output of the Remote Processing element using: “RemoteProcName.outputName”. See Chapter 9: Advanced Techniques for more information.
Parameters:
Remote Processing Name: a name for this Remote Processing element. An object with this name will be created to store the returned outputs. Each output can then be accessed as $remoteProcessingName.$output, eg. ValidatePin.validity.
Source URL: the URL of the remote script to execute, eg. http://server/cgi-bin/script1.
Input Name: name of the input parameter to be submitted, eg. price. This is the name that the server script expects.
Value: value of the input parameter to be submitted, eg. BuyForm.price, 1, ‘AUD’, true, x + 1. This may be any valid ECMAScript expression. The expression may reference any variable defined in the package, such as in a Variables element or a Form object. If the value is an ECMAScript object with fields f1, f2, . . . , then the object is serialised and submitted using the names o.f1, o.f2, . . . , assuming o is the declared input name.
Output Name: name of each parameter returned by the remote script, eg. outField1. The remote script should return a VoiceXML document containing a subdialog form whose return namelist includes all these output fields.
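The serialisation rule for object-valued inputs can be sketched as follows. The function serialiseInput is a hypothetical name; the sketch flattens one level of fields only and converts every value to a string, both of which are assumptions, since the handling of nested objects and of value encoding is not stated here.

```javascript
// Hypothetical sketch: produces the name/value pairs submitted for one
// declared input. An object input named "o" with fields f1 and f2 is
// submitted as "o.f1" and "o.f2"; any other value is submitted as "o".
function serialiseInput(name, value) {
  var pairs = [];
  if (value !== null && typeof value === "object") {
    Object.keys(value).forEach(function (field) {
      pairs.push([name + "." + field, String(value[field])]);
    });
  } else {
    pairs.push([name, String(value)]);
  }
  return pairs;
}
```

Under these assumptions, serialiseInput("price", {dollars: 10, cents: 4}) returns the pairs ["price.dollars", "10"] and ["price.cents", "4"].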

Loop Call

Description: Calls a Loop to execute a part of a dialog that should be iterated several times.
Notes: A corresponding Loop element is required. See Chapter 9: Advanced Techniques for more information.
Parameters:
Loop Call Name: a name for this Loop Call element.
Source: the loop to call. The loop must be defined in this package.

Subroutine Call

Description: Start a subroutine. Used for breaking bigger programs into smaller components, for ease of readability, reuse of code, and support of pre-packaged code.
Notes: A corresponding Subroutine element is required. Input parameter names needn't be declared. Output parameter names are set by the Subroutine element. Other Dialog Elements can refer to the output parameters by calling: “SubroutineCallName.OutputParamName”. See Chapter 9: Advanced Techniques for more information.
Parameters:
Subroutine Call Name: a name for this Subroutine Call element. An object with this name will be created to store the returned outputs. Each output can then be accessed as $subName.$output, eg. ValidatePin.validity.
Source: the subroutine to call (qualified with the package name). If the declared subroutine inputs and outputs have changed, you may need to reselect the subroutine from the Source list box to refresh the displayed inputs and outputs.
Inputs: This is a list of the inputs that the called subroutine expects. For each input, you must specify an ECMAScript expression whose value will be passed to the subroutine, eg. 0, BuyForm.price, pinValidity, ‘AUD’. It may reference any variable defined in the package, such as in a Variables element or a Form object. Note that strings should be quoted in single quotes. Note also that you cannot leave any input value blank.
Outputs: This is a list of the outputs that the called subroutine returns.

Jump

Description: Jump to a predefined block. Can jump to any element in the same MainDialog or Subroutine that has a name.
Notes: Valid destinations for Jump: (1) Jumps within a MainDialog; (2) Jump from a Loop in a MainPackage to the MainDialog; (3) Within a subroutine.
Parameters:
Jump Destination: the destination element to jump to. You can only jump within a main dialog or within a subroutine. You cannot jump to an element in a loop or in another package. The available elements to jump to are presented in the drop-down list box.

End

Description: Terminate the call. End of dialog.
Notes:
Parameters: Causes execution to terminate immediately.

Transfer

Description: Transfer the call to another number.
Notes:
Parameters:
Transfer Name: a name for this Transfer element.
Destination: a number to dial or an ECMAScript expression that evaluates to such a number. A valid number is a string of digits with optional spaces, eg. 1234567, 03 1234567. The number may optionally contain a protocol specifier. Please consult your IVR documentation for specific supported number formats.
Connection Timeout: maximum time in seconds to wait for a connection before a failure is reported.
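The basic form of a valid Destination, namely a string of digits with optional spaces, can be checked with a sketch such as the following. The function name isPlainDialString is hypothetical, and the sketch deliberately does not handle protocol specifiers, whose format depends on the IVR and is not defined here.

```javascript
// Hypothetical sketch: accepts groups of digits separated by spaces.
// Protocol specifiers are not handled because their format is IVR-specific.
function isPlainDialString(number) {
  return /^\d+(?: +\d+)*$/.test(number);
}
```

Under these assumptions, isPlainDialString("03 1234567") returns true, while a string containing letters or an empty string returns false.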

Loop Break

Description: Used within a Loop to break out of the loop.
Notes: Only valid within a Loop.
Parameters: Causes execution to break out of the loop immediately. The loop exit message, if any, is not played.

Loop Next

Description: Used within a Loop to indicate the end of one iteration.
Notes: Only valid within a Loop.
Parameters: Causes execution of the next iteration of the loop, if the loop test condition evaluates to true. The loop step is executed before the condition is evaluated.

Subroutine Return

Description: Return from subroutine.
Notes: Only valid within a Subroutine.
Parameters: Return from a subroutine call.

Variables

Description: Declare “global” variables that can be accessed from anywhere within this Package, eg. inside a Loop, LocalProcessing, Speaker element.
Notes: Form Fields, Subroutine Call outputs and Remote Processing outputs do not need to be declared. They are automatically created.
Parameters:
Variables Name: a name for the variable. The name should be unique within this package.
Value: an ECMAScript expression, usually a constant, that sets the initial value for the variable, eg. 0, ‘cat’, true. Note that strings should be quoted with single quotes. The value may be left empty to create an ECMAScript undefined value.

Hotwords

Description: Create “hot words” that transition the user to a specified point in the dialog when they are uttered by the user.
Notes: Allows the user to jump from one part of the dialog to another if they want to do something else, for example. Hotwords can only be created in the Main Package, not in Subroutine Packages.
Parameters:
Hotwords Hotword: a word or phrase that will trigger a jump.
Destination: the destination element to jump to.

Start Dialog

Description: The entry point for the application.
Notes: This is automatically created when you create a new project. Only the first package has a Start Dialog element. Each project must have one and only one Start Dialog element. You cannot add more Start Dialogs, nor delete any.
Parameters:
Start Dialog: This is the entry point for the application.
Name: a name for this Start Dialog element.
Quit Prompt: the prompt played when the user has requested to quit, after which the dialog will terminate. The system can generate a default text prompt and a default audio file name for recording the corresponding audio prompt, or you may enter your own values. If an audio prompt is not required, the “Generate default audio file name” checkbox should be unchecked and the “Audio file” text box should be left empty.
No Input Prompt: the prompt played when no input is detected while the user is being asked for a response. The user is then reprompted for a response. As with the Quit prompt, the system can generate a default text prompt and a default audio file name, or you may enter your own values.
No Match Prompt: the prompt played when the detected user response cannot be recognised by the system. The user is then reprompted for another response. As with the Quit prompt, the system can generate a default text prompt and a default audio file name, or you may enter your own values.

Subroutine

Description: This is the entry point for a subroutine.
Notes: Subroutine elements cannot be added to the Main Package. They can only be added to non-Main Packages. More than one Subroutine can be added to a non-Main Package. However, all Subroutine inputs and outputs must be unique within the Package, ie. no two Subroutines can declare the same input name or output name. Furthermore, no input name can be the same as an output name. The Subroutine element is created by (1) right-clicking on the “Project” icon on the Dialogs/Project Files window (LHS) to add a new Package; and (2) right-clicking on a Package icon on the Dialogs window (LHS) to add a new Subroutine. Each path in the Subroutine should end with a Return element, otherwise the Subroutine will not return to the calling dialog. See Chapter 9: Advanced Techniques for more information.
Parameters:
Subroutine Name: a name for this Subroutine element.
Input Name: name of each input parameter expected, eg. pin. The input will be declared as a document variable that you can access from anywhere within the package. You do not need to (re)declare it under the Variables element.
Output Name: name of each return parameter, eg. validity. The output will be declared as a document variable that you can access from anywhere within the package. You do not need to (re)declare it under the Variables element.
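The uniqueness rule for Subroutine inputs and outputs within a Package can be expressed as a simple check. The function findDuplicateParameterNames and the {name, inputs, outputs} object shape are assumptions made for illustration; the sketch only reports names that are declared more than once and does not describe how the system 100 enforces the rule.

```javascript
// Hypothetical sketch: returns every parameter name that is declared more
// than once across the inputs and outputs of all Subroutines in a Package.
function findDuplicateParameterNames(subroutines) {
  var seen = Object.create(null);
  var duplicates = [];
  subroutines.forEach(function (sub) {
    sub.inputs.concat(sub.outputs).forEach(function (name) {
      if (seen[name]) {
        if (duplicates.indexOf(name) === -1) {
          duplicates.push(name);
        }
      } else {
        seen[name] = true;
      }
    });
  });
  return duplicates;
}
```

An empty result indicates that the Package satisfies the rule. For example, two Subroutines that both declare an input named pin yield the result ["pin"].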

Loop

Description: A portion of the dialog that should be executed several times.
Notes: The Loop element is created by right-clicking on the Package icon on the Dialogs window (LHS). Variables are freely shared between the Loop body and the main dialog (of the same Package) as they are in one VoiceXML document. See Chapter 9: Advanced Techniques for more information.
Parameters:
Loop Name: a name for this Loop element.
Loop Test: an ECMAScript expression that, if evaluated to true, will cause execution of the next iteration of the loop, eg. counter < 5.
Test at start: If enabled, the test condition is evaluated at the start of the loop and the loop is equivalent to a while/for loop. If disabled, the loop body is executed before the test condition is evaluated and the loop is equivalent to a do-while loop.
Exit message: a message to be played when the loop exits normally, eg. there are no more items. The system can generate a default text prompt and a default audio file name for recording the corresponding audio prompt, or you may enter your own values. If a message is not required, the “Generate default...” checkboxes should be unchecked and the “Audio file” and “TTS” text boxes should be left empty.
Loop Init: Variables to be initialised at the start of the loop.
Name: name of a variable, eg. counter. The variable must have been created elsewhere, such as in the Variables element.
Value: an ECMAScript expression that sets the initial value for the variable, eg. 0.
Loop Step: Variables to be incremented before another iteration of the loop. The increment occurs before the test condition is reevaluated.
Name: name of a variable, eg. counter.
Value: an ECMAScript expression to increment the variable, eg. counter + 1.
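The interaction of Loop Init, Loop Test, “Test at start” and Loop Step can be summarised in a short sketch. The function runLoop and the shape of its loop argument are hypothetical; the exit message and the Loop Break element are not modelled.

```javascript
// Hypothetical sketch of the loop semantics described above.
// Returns the number of times the loop body was executed.
function runLoop(loop, scope) {
  loop.init(scope); // Loop Init: initialise variables at the start
  var iterations = 0;
  // With "Test at start" enabled, the loop behaves like a while/for loop
  // and may execute zero times.
  if (loop.testAtStart && !loop.test(scope)) {
    return iterations;
  }
  do {
    loop.body(scope);
    iterations++;
    loop.step(scope); // Loop Step runs before the test is reevaluated
  } while (loop.test(scope));
  return iterations;
}

// Example using the values given above: counter = 0; counter < 5; counter + 1.
var counterLoop = {
  testAtStart: true,
  init: function (s) { s.counter = 0; },
  test: function (s) { return s.counter < 5; },
  step: function (s) { s.counter = s.counter + 1; },
  body: function (s) {}
};
```

Under these assumptions, runLoop(counterLoop, {}) executes the body five times. With “Test at start” disabled, the body runs once even if the test is false on entry, as in a do-while loop.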

1. A process for developing a voice application, including: generating graphical user interface components for defining execution paths of said application by arranging dialog elements in a tree structure, each path through said tree structure representing one of said execution paths; generating said dialog elements on the basis of predetermined templates and properties of said dialog elements, said properties received from a user via said graphical user interface components, each of said dialog elements corresponding to at least one voice language template; and generating at least one voice language module for said application on the basis of said at least one voice language template and said properties.
2. A process as claimed in claim 1, wherein the voice language templates include VoiceXML elements.
3. A process as claimed in claim 2, wherein said at least one voice language module includes extended VoiceXML elements including VoiceXML tags and additional information to allow said dialog elements to be generated from said at least one voice language module.
4. A process as claimed in claim 3, wherein said additional information includes one or more attributes of said VoiceXML tags.
5. A process as claimed in claim 4, wherein said one or more attributes include qualified names.
6. A process as claimed in claim 1, wherein each of said at least one voice language modules includes a reference to the next of said at least one voice language modules in an execution path of said application.
7. A process as claimed in claim 1, including generating a graphical representation of said dialog elements and said execution paths on the basis of said at least one voice language module.
8. A process as claimed in claim 1, including generating extended VoiceXML code, prompt data, and grammar data for said application.
9. A process as claimed in claim 8, wherein said prompt data is represented as a grammar, and said process includes improving said grammar.
10. A process as claimed in claim 1, including generating at least one script for generating a prompt for said application on the basis of one or more parameters supplied to said script.
11. A process as claimed in claim 10, wherein said at least one script is generated on the basis of at least one script template and prompt data defined for said prompt by a user.
12. A process as claimed in claim 11, wherein said at least one script includes ECMAScript.
13. A process as claimed in claim 8, including generating VoiceXML code and IVR grammar data for execution of said application on an IVR system on the basis of said extended VoiceXML code, prompt data, and grammar data.
14. A system having components for executing the process of claim 1.
15. A system having components for executing the process of claim 1.
16. A computer readable storage medium having stored thereon program code for executing the process of claim 3.
17. A system for use in developing a voice application, including: a dialog element selector for defining execution paths of said application by selecting dialog elements and adding said dialog elements to a tree structure, each path through said tree structure representing one of said execution paths; a dialog element generator for generating said dialog elements on the basis of predetermined templates and properties of said dialog elements, said properties received from a user of said system, each of said dialog elements corresponding to at least one voice language template; and a code generator for generating at least one voice language module for said application on the basis of said at least one voice language template and said properties.
18. A system as claimed in claim 17, wherein said selector is adapted to generate a graphical representation of said dialog elements and said execution paths on the basis of said at least one voice language module.
19. A system as claimed in claim 17, wherein said code generator is adapted to generate extended VoiceXML code, prompt data, and grammar data for said application.
20. A system as claimed in claim 19, wherein said prompt data is represented as a grammar, and the system includes one or more modules for improving said grammar.
21. A system as claimed in claim 17, including a script generator for generating at least one script for generating a prompt for said application on the basis of one or more parameters supplied to said script.
22. A system as claimed in claim 21, wherein said script generator is adapted to generate said at least one script on the basis of at least one script template and prompt data defined for said prompt by a user.
23. A system as claimed in claim 19, wherein said code generator is adapted to generate VoiceXML code and IVR grammar data for execution of said application on an IVR system on the basis of said extended VoiceXML code, prompt data, and grammar data.
24. An extended VoiceXML file generated by a system of claim 17.
25. A graphical user interface for use in developing a voice application, said interface including graphical user interface components for defining execution paths of said application by arranging configurable dialog elements in a tree structure, each path through said tree structure representing one of said execution paths, and said dialog element components may include one or more of: a start dialog component for defining the start of said application; a variables component for use in defining variables for said application; a menu component for defining a menu; a menu choice component for defining a choice of said menu; a decision component for defining a decision branching point; a decision branch component for defining a test condition and an execution branch of said decision branching point; a form component for defining a form to collect input from a caller; a record component for recording audio; a speaker component for playing prompts; a local processing component for defining local processing; a remote processing component for performing processing on a remote system; a loop component for defining an execution loop; a loop call component for calling said loop; a loop next component for proceeding to the next cycle of said loop; a loop break component for breaking out of said loop; a subroutine component for defining a subroutine; a subroutine call component for calling said subroutine; a subroutine return component for returning from said subroutine; a jump component for defining a non-sequential execution path to a dialog element; a transfer component representing the transfer of a call to another number; a hotwords component for defining a word or phrase and a non-sequential execution path to a dialog element to be followed upon receipt of said word or phrase; and an end component for defining an end of said application.